
    ProbCD: enrichment analysis accounting for categorization uncertainty

    As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high-throughput datasets, current enrichment methods largely ignore this probabilistic information, since they are mainly based on variants of Fisher's exact test. We developed ProbCD, an open-source R package for probabilistic categorical data analysis that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, for enrichment analysis, ProbCD can accommodate (i) the stochastic nature of high-throughput experimental techniques and (ii) probabilistic gene annotation.
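    The expected contingency table described above can be sketched as follows. This is an illustrative reconstruction under assumed inputs, not the ProbCD implementation; the variable names `p` and `q` are hypothetical.

```python
# Sketch (not ProbCD itself): the expectation of a 2x2 contingency table
# under independent Bernoulli draws, given categorization probabilities.
# p[i]: assumed probability that gene i carries the annotation
# q[i]: assumed probability that gene i belongs to the study set

def expected_contingency(p, q):
    """Expected 2x2 table [[in-cat & in-set, in-cat & out], [out & in-set, out & out]]."""
    n11 = sum(pi * qi for pi, qi in zip(p, q))
    n10 = sum(pi * (1 - qi) for pi, qi in zip(p, q))
    n01 = sum((1 - pi) * qi for pi, qi in zip(p, q))
    n00 = sum((1 - pi) * (1 - qi) for pi, qi in zip(p, q))
    return [[n11, n10], [n01, n00]]

# With certain (0/1) probabilities this reduces to an ordinary count table.
table = expected_contingency([0.9, 0.2, 0.7], [1.0, 0.5, 0.0])
```

    A conventional association test could then be applied to this expected table; note that the entries always sum to the number of genes, regardless of how uncertain the individual memberships are.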

    Glutamate 301 of the mouse gonadotropin-releasing hormone receptor confers specificity for arginine 8 of mammalian gonadotropin-releasing hormone

    The Arg residue at position 8 of mammalian GnRH is necessary for high affinity binding to mammalian GnRH receptors. This requirement has been postulated to derive from an electrostatic interaction of Arg8 with a negatively charged receptor residue. In order to identify such a residue, eight conserved acidic residues of the mouse GnRH receptor were mutated to isosteric Asn or Gln. Mutant receptors were tested for decreased preference for Arg8-containing ligands by ligand binding and inositol phosphate production. One of the mutants, in which the Glu301 residue was mutated to Gln, exhibited a 56-fold decrease in apparent affinity for mammalian GnRH. The mutant receptor also exhibited decreased affinity for [Lys8]GnRH, but its affinity for [Gln8]GnRH was unchanged compared with the wild type receptor. The apparent affinity of the mutant receptor for the acidic analogue, [Glu8]GnRH, was increased more than 10-fold. The mutant receptor did not, therefore, distinguish mammalian GnRH from analogues with amino acid substitutions at position 8 as effectively as the wild type receptor. This loss of discrimination was specific for the residue at position 8, because the mutant receptor did distinguish mammalian GnRH from analogues with favorable substitutions at positions 5, 6, and 7. These findings show that Glu301 of the GnRH receptor plays a role in receptor recognition of Arg8 in the ligand and are consistent with an electrostatic interaction between these two residues.

    Proteome Profiling of Breast Tumors by Gel Electrophoresis and Nanoscale Electrospray Ionization Mass Spectrometry

    We have conducted proteome-wide analysis of fresh surgery specimens derived from breast cancer patients, using an approach that integrates size-based intact protein fractionation, nanoscale liquid separation of peptides, electrospray ion trap mass spectrometry, and bioinformatics. Through this approach, we have acquired a large amount of peptide fragmentation spectra from size-resolved fractions of the proteomes of several breast tumors, tissue peripheral to the tumor, and samples from patients undergoing noncancer surgery. Label-free quantitation was used to generate protein abundance maps for each proteome and perform comparative analyses. The mass spectrometry data revealed distinct qualitative and quantitative patterns distinguishing the tumors from healthy tissue as well as differences between metastatic and non-metastatic human breast cancers including many established and potential novel candidate protein biomarkers. Selected proteins were evaluated by Western blotting using tumors grouped according to histological grade, size, and receptor expression but differing in nodal status. Immunohistochemical analysis of a wide panel of breast tumors was conducted to assess expression in different types of breast cancers and the cellular distribution of the candidate proteins. These experiments provided further insights and an independent validation of the data obtained by mass spectrometry and revealed the potential of this approach for establishing multimodal markers for early metastasis, therapy outcomes, prognosis, and diagnosis in the future. © 2008 American Chemical Society

    GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    Background: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set, and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists; these typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible-threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms in a matter of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation. GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
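    The mHG statistic mentioned above can be sketched in a few lines: it is the minimum hypergeometric tail probability over every cutoff at the top of the ranked list. This is an illustrative sketch of the score only; the exact, multiple-testing-corrected p-value that GOrilla reports on top of this score is not reproduced here, and the function names are assumptions.

```python
from math import comb

def hg_tail(b, N, B, n):
    """P(X >= b) for X ~ Hypergeometric(N total, B marked, n drawn)."""
    # math.comb returns 0 when the second argument exceeds the first,
    # so out-of-range terms vanish automatically.
    return sum(comb(B, k) * comb(N - B, n - k)
               for k in range(b, min(B, n) + 1)) / comb(N, n)

def mhg_score(labels):
    """labels: ranked 0/1 list, 1 = gene carries the GO term being tested."""
    N, B = len(labels), sum(labels)
    best, b = 1.0, 0
    for n in range(1, N):        # scan every top-of-list cutoff
        b += labels[n - 1]       # marked genes seen among the top n
        best = min(best, hg_tail(b, N, B, n))
    return best
```

    A list whose marked genes all sit at the top (e.g. `[1, 1, 1, 0, 0, 0]`) yields a small mHG score, while the reversed list yields a score near 1, which is exactly the enrichment-at-the-top behaviour the tool exploits.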

    The acceleration of the universe and the physics behind it

    Using a general classification of dark energy models into four classes, we discuss the complementarity of cosmological observations in pinning down the physics behind the acceleration of our universe. We discuss the tests distinguishing the four classes and then focus on the dynamics of the perturbations in the Newtonian regime. We also explicitly exhibit models that have identical predictions for a subset of observations.

    Planck 2013 results. XXII. Constraints on inflation

    We analyse the implications of the Planck data for cosmic inflation. The Planck nominal mission temperature anisotropy measurements, combined with the WMAP large-angle polarization, constrain the scalar spectral index to be ns = 0.9603 ± 0.0073, ruling out exact scale invariance at over 5σ. Planck establishes an upper bound on the tensor-to-scalar ratio of r < 0.11 (95% CL). The Planck data thus shrink the space of allowed standard inflationary models, preferring potentials with V'' < 0. Exponential potential models, the simplest hybrid inflationary models, and monomial potential models of degree n ≥ 2 do not provide a good fit to the data. Planck does not find statistically significant running of the scalar spectral index, obtaining dns/dln k = −0.0134 ± 0.0090. We verify these conclusions through a numerical analysis, which makes no slow-roll approximation, and carry out a Bayesian parameter estimation and model-selection analysis for a number of inflationary models including monomial, natural, and hilltop potentials. For each model, we present the Planck constraints on the parameters of the potential and explore several possibilities for the post-inflationary entropy generation epoch, thus obtaining nontrivial data-driven constraints. We also present a direct reconstruction of the observable range of the inflaton potential. Unless a quartic term is allowed in the potential, we find results consistent with second-order slow-roll predictions. We also investigate whether the primordial power spectrum contains any features. We find that models with a parameterized oscillatory feature improve the fit by Δχ²_eff ≈ 10; however, Bayesian evidence does not prefer these models. We constrain several single-field inflation models with generalized Lagrangians by combining power spectrum data with Planck bounds on fNL. Planck constrains with unprecedented accuracy the amplitude and possible correlation (with the adiabatic mode) of non-decaying isocurvature fluctuations. The fractional primordial contributions of cold dark matter (CDM) isocurvature modes of the types expected in the curvaton and axion scenarios have upper bounds of 0.25% and 3.9% (95% CL), respectively. In models with arbitrarily correlated CDM or neutrino isocurvature modes, an anticorrelated isocurvature component can improve the χ²_eff by approximately 4 as a result of slightly lowering the theoretical prediction for the ℓ ≲ 40 multipoles relative to the higher multipoles. Nonetheless, the data are consistent with adiabatic initial conditions.
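    The spectral-index and tensor-to-scalar bounds quoted above are conventionally tied to the shape of the inflaton potential through the first-order slow-roll relations (standard textbook expressions, not quoted from the paper itself):

```latex
% First-order slow-roll relations between the observables and the
% potential slow-roll parameters \epsilon_V and \eta_V:
n_s - 1 \simeq -6\epsilon_V + 2\eta_V, \qquad r \simeq 16\,\epsilon_V,
\qquad
\epsilon_V = \frac{M_{\mathrm{Pl}}^2}{2}\left(\frac{V'}{V}\right)^2,
\qquad
\eta_V = M_{\mathrm{Pl}}^2\,\frac{V''}{V}.
```

    In this language, the bound r < 0.11 translates into ε_V ≲ 0.007, and the stated preference for V'' < 0 is a preference for negative η_V.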

    Increasing consistency of disease biomarker prediction across datasets

    Microarray studies with human subjects often have limited sample sizes, which hampers the detection of reliable biomarkers associated with disease and motivates the need to aggregate data across studies. However, human gene expression measurements may be influenced by many non-random factors such as genetics, sample preparation, and tissue heterogeneity. These factors can contribute to a lack of agreement among related studies, limiting the utility of their aggregation. We show that it is feasible to carry out an automatic correction of individual datasets to reduce the effect of such 'latent variables' (without prior knowledge of the variables) in such a way that datasets addressing the same condition show better agreement once each is corrected. We build our approach on the method of surrogate variable analysis (SVA), but we demonstrate that the original algorithm is unsuitable for the analysis of human tissue samples that are mixtures of different cell types. We propose a modification to SVA that is crucial to obtaining the improvement in agreement that we observe. We develop our method on a compendium of multiple sclerosis data and verify it on an independent compendium of Parkinson's disease datasets. In both cases, we show that our method is able to improve agreement across varying study designs, platforms, and tissues. This approach has the potential for wide applicability to any field where lack of inter-study agreement has been a concern. © 2014 Chikina, Sealfon

    Misty Mountain clustering: application to fast unsupervised flow cytometry gating

    Background: There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets such as flow cytometry data, prove unsatisfactory in terms of speed, susceptibility to local minima, and cluster-shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval; the final cluster number is then selected by various criteria. These supervised serial clustering methods are time consuming, and different criteria frequently yield different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10^6 points that are often generated by high-throughput experiments. Results: To circumvent these limitations, we developed a new, unsupervised density contour clustering algorithm, called Misty Mountain, that is based on percolation theory and that efficiently analyzes large datasets. The approach can be envisioned as a progressive top-down removal of clouds covering a data histogram relief map, identifying clusters by the appearance of statistically distinct peaks and ridges. This is a parallel clustering method that finds every cluster after analyzing the cross sections of the histogram only once. The overall run time for the composite steps of the algorithm increases linearly with the number of data points. The clustering of 10^6 data points in a 2D data space takes about 15 seconds on a standard laptop PC. Comparison of the performance of this algorithm with other state-of-the-art automated flow cytometry gating methods indicates that Misty Mountain provides substantial improvements in both run time and the accuracy of cluster assignment. Conclusions: Misty Mountain is fast, unbiased with respect to cluster shape, identifies stable clusters and is robust to noise. It provides a useful, general solution for multidimensional clustering problems. We demonstrate its suitability for automated gating of flow cytometry data.
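    The "cross section" idea can be illustrated with a minimal sketch: bin the data into a 2D histogram and group above-threshold bins into connected components, which play the role of the statistically distinct peaks at one density level. This is an illustrative simplification under assumed inputs, not the Misty Mountain implementation (which scans levels progressively and applies statistical tests).

```python
# Sketch: one density cross section of a 2D histogram.
# Bins with counts >= threshold are grouped into 4-connected components.
from collections import Counter, deque

def histogram_2d(points, bin_size):
    """Count points per (i, j) histogram bin."""
    return Counter((int(x // bin_size), int(y // bin_size)) for x, y in points)

def cross_section_clusters(counts, threshold):
    """Connected components (4-neighbour) of bins with count >= threshold."""
    above = {b for b, c in counts.items() if c >= threshold}
    clusters, seen = [], set()
    for start in above:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:                      # flood fill one component
            i, j = queue.popleft()
            if (i, j) in seen or (i, j) not in above:
                continue
            seen.add((i, j))
            comp.add((i, j))
            queue.extend([(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)])
        clusters.append(comp)
    return clusters
```

    Two well-separated blobs of points produce two components at a suitable threshold; the full algorithm repeats this across descending density levels and tracks how components appear and merge.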